Persistent Contextual Neural Networks for learning symbolic data sequences

Author

  • Yann Ollivier
Abstract

We introduce persistent contextual neural networks (PCNNs) as a probabilistic model for learning symbolic data sequences, aimed at discovering complex algorithmic dependencies in the sequence. PCNNs are similar to recurrent neural networks but feature an architecture inspired by finite automata and a modified time evolution to better model memory effects. An effective training procedure using a gradient ascent in a metric inspired by Riemannian geometry is developed: this produces an algorithm independent of design choices such as the encoding of parameters and unit activities. This metric gradient ascent is designed to have an algorithmic cost close to that of backpropagation through time for sparsely connected networks. PCNNs are demonstrated to effectively capture a variety of complex algorithmic constraints on hard synthetic problems: basic block nesting as in context-free grammars (an important feature of natural languages, but difficult to learn), intersections of multiple independent Markov-type relations, or long-distance relationships such as the distant-XOR problem. On this problem, PCNNs perform better than more complex state-of-the-art algorithms. Thanks to the metric update, fewer gradient steps and training samples are needed: for instance, a generating model for sequences of the form aⁿbⁿ can be learned from only 10 samples in under two minutes, even with n ranging in the thousands. This text is a preliminary version.

The problem considered here is to learn a probabilistic model for an observed sequence of symbols (x0, . . . , xt, . . .) over a finite alphabet A. Such a model can be used for prediction, compression, or generalization. Hidden Markov models (HMMs) are frequently used in such a setting. However, the kind of algorithmic structures HMMs can represent is limited because of the underlying finite automaton structure.
Below, we discuss examples of simple sequential data that cannot be, or cannot conveniently be, represented by HMMs; for instance, subsequence insertions, or intersections of multiple independent constraints.
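To make the aⁿbⁿ synthetic task mentioned in the abstract concrete, a minimal data generator might look like the following sketch (the function names and the choice of Python are illustrative, not part of the paper):

```python
import random

def sample_anbn(n_max=1000, seed=None):
    """Draw one training sequence of the form a^n b^n, with n chosen at random."""
    rng = random.Random(seed)
    n = rng.randint(1, n_max)
    return "a" * n + "b" * n

def is_anbn(s):
    """Check the context-free constraint a^n b^n: all a's first, equal counts."""
    n = s.count("a")
    return n > 0 and s == "a" * n + "b" * n

# Any fixed-order Markov model only sees a bounded window of past symbols,
# so it cannot enforce the global equality #a == #b when n is unbounded;
# a learner must instead maintain something like a counter or stack.
sample = sample_anbn(n_max=5, seed=0)
assert is_anbn(sample)
```

The point of the check is that membership in {aⁿbⁿ} depends on an unbounded count, which is exactly the kind of constraint a finite-automaton-based model such as an HMM cannot represent exactly.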


Related articles

Training Knowledge-Based Neural Networks to Recognize Genes in DNA Sequences

We describe the application of a hybrid symbolic/connectionist machine learning algorithm to the task of recognizing important genetic sequences. The symbolic portion of the KBANN system utilizes inference rules that provide a roughly-correct method for recognizing a class of DNA sequences known as eukaryotic splice-junctions. We then map this "domain theory" into a neural network and provide t...


Introducing deep modular neural networks with a dual spatio-temporal structure to improve continuous Persian speech recognition

In this article, growable deep modular neural networks for continuous speech recognition are introduced. These networks can be grown to implement the spatio-temporal information of the frame sequences at their input layer as well as their labels at the output layer at the same time. The trained neural network with such double spatio-temporal association structure can learn the phonetic sequence...



A general framework for adaptive processing of data structures

A structured organization of information is typically required by symbolic processing. On the other hand, most connectionist models assume that data are organized according to relatively poor structures, like arrays or sequences. The framework described in this paper is an attempt to unify adaptive models like artificial neural nets and belief nets for the problem of processing structured infor...


Perspectives on Learning Symbolic Data with Connectionistic Systems

This paper deals with the connection of symbolic and subsymbolic systems. It focuses on connectionistic systems processing symbolic data. We examine the capability of learning symbolic data with various neural architectures which constitute partially dynamic approaches: discrete time partially recurrent neural networks as a simple and well established model for processing sequences, and advance...



Journal:
  • CoRR

Volume: abs/1306.0514

Publication date: 2013